Method and system for atomically writing scattered information in a solid state storage device

ABSTRACT

Disclosed herein are several methods and systems for handling atomic write commands that write data to scattered address ranges. One embodiment includes a method of performing an operation in a data storage device, the method comprising: receiving an atomic write command; obtaining a plurality of ranges of logical addresses affected by the atomic write command; for each of the plurality of affected ranges, assigning metadata information to track completion of a write operation performed at that range; performing the write operations in the ranges of logical addresses; updating the metadata information upon completion of the write operations in the ranges; and deferring an update to a translation map of the data storage device until the metadata information has been updated.

CROSS-REFERENCE TO RELATED APPLICATION(S)

This application is a continuation of U.S. patent application Ser. No. 14/060,547, filed Oct. 22, 2013, entitled “Method and System for Atomically Writing Scattered Information in a Solid State Storage Device,” the contents of which are expressly incorporated by reference herein in their entirety and for all purposes. U.S. patent application Ser. No. 14/060,547 claims the benefit of U.S. provisional application No. 61/824,460, filed May 17, 2013, entitled “Method and System for Atomically Writing Scatter Information in a Solid State Storage Device,” the disclosure of which is hereby incorporated in its entirety.

BACKGROUND

Due to the nature of flash memory in solid state drives (SSDs), data is typically programmed by pages and erased by blocks. A page in an SSD is typically 8-16 kilobytes (KB) in size and a block consists of a large number of pages (e.g., 256 or 512). Thus, a particular physical location in an SSD (e.g., a page) cannot be directly overwritten without overwriting data in pages within the same block, as is possible in a magnetic hard disk drive. As such, address indirection is needed. Conventional data storage device controllers, which manage the Flash memory on the data storage device and interface with the host system, use a Logical-to-Physical (L2P) mapping system known as logical block addressing (LBA) that is part of the Flash translation layer (FTL). When new data comes in replacing older data already written, the data storage device controller causes the new data to be written in a new location (as the data storage device cannot directly overwrite the old data) and updates the logical mapping to point to the new physical location. At this juncture, the old physical location no longer holds valid data. As such, the old physical location will eventually need to be erased before it can be written again.

Conventionally, a large L2P map table maps logical entries to physical address locations on an SSD. This large L2P map table is usually saved in small sections as writes come in. For example, if random writing occurs, although the system may have to update only one entry, it may nonetheless have to save the entire table or a portion thereof, including entries that have not been updated, which is inherently inefficient.

FIG. 1 shows aspects of a conventional Logical Block Addressing (LBA) scheme for data storage devices. As shown therein, a map table 104 contains one entry for every logical block 102 defined for the data storage device's Flash memory 106. For example, a 64 GB data storage device that supports 512 byte logical blocks may present itself to the host as having 125,000,000 logical blocks. One entry in the map table 104 contains the current location of each of the 125,000,000 logical blocks in the Flash memory 106. In a conventional data storage device, a Flash page holds an integer number of logical blocks (i.e., a logical block does not span across Flash pages). In this conventional example, an 8 KB Flash page would hold 16 logical blocks (of size 512 bytes). Therefore, each entry in the logical-to-physical map table 104 contains a field 108 identifying the die on which the LBA is stored, a field 110 identifying the flash block on which the LBA is stored, another field 112 identifying the flash page within the flash block and a field 114 identifying the offset within the flash page that identifies where the LBA data begins in the identified Flash page. The large size of the map table 104 prevents the table from being held inside the SSD controller. Conventionally, the large map table 104 is held in an external DRAM connected to the SSD controller. As the map table 104 is stored in volatile DRAM, it must be restored when the SSD powers up, which can take a long time, due to the large size of the table.
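
For illustration only, the following is a minimal sketch (in Python) of the kind of per-logical-block map entry FIG. 1 describes. The field names, their types and the lookup helper are assumptions made for this sketch, not the patented layout; only the die/block/page/offset fields and the 125,000,000-entry sizing come from the text above.

```python
from dataclasses import dataclass

@dataclass
class MapEntry:
    die: int      # field 108: die on which the LBA is stored
    block: int    # field 110: flash block within that die
    page: int     # field 112: flash page within the block
    offset: int   # field 114: offset of the logical block's data within the flash page

# One entry per logical block: a 64 GB device with 512 byte logical blocks
# presents 64e9 / 512 = 125,000,000 logical blocks, hence 125,000,000 map entries.
NUM_LOGICAL_BLOCKS = 64 * 10**9 // 512
assert NUM_LOGICAL_BLOCKS == 125_000_000

def lookup(map_table, lba):
    """Return the current physical location of a logical block (conventional LBA scheme)."""
    return map_table[lba]
```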

When a logical block is written, the corresponding entry in the map table 104 is updated to reflect the new location of the logical block. When a logical block is read, the corresponding entry in the map table 104 is read to determine the location in Flash memory to be read. A read is then performed to the Flash page specified in the corresponding entry in the map table 104. When the read data is available for the Flash page, the data at the offset specified by the Map Entry is transferred from the Flash device to the host. When a logical block is written, the Flash memory holding the “old” version of the data becomes “garbage” (i.e., data that is no longer valid). It is to be noted that when a logical block is written, the Flash memory will initially contain at least two versions of the logical block; namely, the valid, most recently written version (pointed to by the map table 104) and at least one other, older version thereof that is stale and is no longer pointed to by any entry in the map table 104. This “stale” data is referred to as garbage, which occupies space that must be accounted for, collected, erased and made available for future use. This process is known as “garbage collection”.

An atomic command is one that is either performed completely or not at all. Since a power cycle is often the reason some commands cannot finish, any atomic write implementation must take the power cycle issue into account. Conventional methods of implementing atomic write commands in flash-based data storage devices either do not allow for efficient detection of incompletely-processed atomic write commands and efficient garbage collection of blocks containing in-process atomic writes and metadata, or they rely on duplicating the atomic write data in buffers, thereby increasing write amplification and system complexity and creating free space accounting issues.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows aspects of a conventional Logical Block Addressing scheme for SSDs.

FIG. 2 is a block diagram of a data storage device according to one embodiment, as well as aspects of the physical and logical data organization of such a data storage device.

FIG. 3 shows a logical-to-physical address translation map and illustrative entries thereof, according to one embodiment.

FIG. 4 shows aspects of a method for updating a logical-to-physical address translation map and for creating an S-Journal entry, according to one embodiment.

FIG. 5 is a block diagram of an S-Journal, according to one embodiment.

FIG. 6 shows an exemplary organization of one entry of an S-Journal, according to one embodiment.

FIG. 7 is a block diagram of a superblock (S-Block), according to one embodiment.

FIG. 8 shows another view of a Super Page (S-Page), according to one embodiment.

FIG. 9A shows relationships between the logical-to-physical address translation map, S-Journals and S-Blocks, according to one embodiment.

FIG. 9B is a block diagram of an S-Journal Map, according to one embodiment.

FIG. 10 is a block diagram of a data structure in which atomic sequence numbers, used in processing atomic write commands, may be stored, according to one embodiment.

FIG. 11 is a block diagram illustrating aspects of non-atomic and atomic writes, according to one embodiment.

FIG. 12 shows aspects of an S-Journal comprising an S-Journal entry for an atomic write, according to one embodiment.

FIG. 13 is a flowchart of a method for processing atomic write commands, according to one embodiment.

FIG. 14 is a flowchart of further aspects of a method for processing atomic write commands, according to one embodiment.

FIGS. 15A-D illustrate handling of slot number assignment according to one embodiment.

FIGS. 16A-C illustrate handling of partial atomic write commands according to one embodiment.

FIG. 17 shows how one embodiment handles the situations when multiple disparate LBAs to be written are specified in either one host command, or in multiple host commands that are grouped together into one atomic operation.

FIG. 18 is a flow diagram showing the handling of atomic command(s) that may write scattered information according to one embodiment.

DETAILED DESCRIPTION

System Overview

FIG. 2 is a diagram showing aspects of the physical and logical data organization of a data storage device according to one embodiment. In one embodiment, the data storage device is an SSD. In another embodiment, the data storage device is a hybrid drive including Flash memory and rotating magnetic storage media. The disclosure is applicable to both SSD and hybrid implementations, but for the sake of simplicity the various embodiments are described with reference to SSD-based implementations. A data storage device controller 202 according to one embodiment may be configured to be coupled to a host, as shown at reference numeral 218. The controller may comprise one or more processors that execute some or all of the functions described below as being performed by the controller. The host 218 may utilize a logical block addressing (LBA) scheme. While the LBA size is normally fixed, the host can vary the size of the LBA dynamically. For example, the LBA size may vary by interface and interface mode. Indeed, while 512 bytes is most common, 4 KB is also becoming more common, as are 512+ (520, 528, etc.) and 4 KB+ (4 KB+8, 4 KB+16, etc.) formats. As shown therein, the data storage device controller 202 may comprise or be coupled to a page register 204. The page register 204 may be configured to enable the controller 202 to read data from and store data to the data storage device. The controller 202 may be configured to program and read data from an array of flash memory devices responsive to data access commands from the host 218. While the description herein refers to flash memory, it is understood that the array of memory devices may comprise other types of non-volatile memory devices such as flash integrated circuits, Chalcogenide RAM (C-RAM), Phase Change Memory (PC-RAM or PRAM), Programmable Metallization Cell RAM (PMC-RAM or PMCm), Ovonic Unified Memory (OUM), Resistance RAM (RRAM), NAND memory (e.g., single-level cell (SLC) memory, multi-level cell (MLC) memory, or any combination thereof), NOR memory, EEPROM, Ferroelectric Memory (FeRAM), Magnetoresistive RAM (MRAM), other discrete NVM (non-volatile memory) chips, or any combination thereof.

The page register 204 may be configured to enable the controller 202 to read data from and store data to the array. According to one embodiment, the array of flash memory devices may comprise a plurality of non-volatile memory devices in die (e.g., 128 dies), each of which comprises a plurality of blocks, such as shown at 206 in FIG. 2. Other page registers 204 (not shown) may be coupled to blocks on other die. A combination of Flash blocks, grouped together, may be called a Superblock or S-Block. In some embodiments, the individual blocks that form an S-Block may be chosen from one or more dies, planes or other levels of granularity. An S-Block, therefore, may comprise a plurality of Flash blocks, spread across one or more die, that are combined together. In this manner, the S-Block may form a unit on which the Flash Management System (FMS) operates. In some embodiments, the individual blocks that form an S-Block may be chosen according to a different granularity than at the die level, such as the case when the memory devices include dies that are sub-divided into structures such as planes (i.e., blocks may be taken from individual planes). According to one embodiment, allocation, erasure and garbage collection may be carried out at the S-Block level. In other embodiments, the FMS may perform data operations according to other logical groupings such as pages, blocks, planes, dies, etc.

In turn, each of the Flash blocks 206 comprises a plurality of Flash pages (F-Pages) 208. Each F-Page may be of a fixed size such as, for example, 16 KB. The F-Page, according to one embodiment, is the size of the minimum unit of program for a given Flash device. As shown in FIG. 3, each F-Page 208 may be configured to accommodate a plurality of physical pages, hereinafter referred to as E-Pages 210. The term “E-Page” refers to a data structure stored in Flash memory on which an error correcting code (ECC) has been applied. According to one embodiment, the E-Page 210 may form the basis for physical addressing within the data storage device and may constitute the minimum unit of Flash read data transfer. The E-Page 210, therefore, may be (but need not be) of a predetermined fixed size (such as 2 KB, for example) and determine the size of the payload (e.g., host data) of the ECC system. According to one embodiment, each F-Page 208 may be configured to fit a predetermined plurality of E-Pages 210 within its boundaries. For example, given 16 KB size F-Pages 208 and a fixed size of 2 KB per E-Page 210, eight E-Pages 210 fit within a single F-Page 208, as shown in FIG. 3. In any event, according to one embodiment, a power of 2 multiple of E-Pages 210, including ECC, may be configured to fit into an F-Page 208. Each E-Page 210 may comprise a data portion 214 and, depending on where the E-Page 210 is located, may also comprise an ECC portion 216. Neither the data portion 214 nor the ECC portion 216 need be fixed in size. The address of an E-Page uniquely identifies the location of the E-Page within the Flash memory. For example, the E-Page's address may specify the Flash channel, a particular die within the identified Flash channel, a particular block within the die, a particular F-Page and, finally, the E-Page within the identified F-Page.
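
As a rough illustration of this geometry, the sketch below (Python) checks the eight-E-Pages-per-F-Page example and decomposes a hypothetical E-Page address into channel/die/block/F-Page/E-Page fields. The particular field widths and helper names are assumptions made for this sketch; the text above does not specify them.

```python
F_PAGE_SIZE = 16 * 1024   # example F-Page size from the text
E_PAGE_SIZE = 2 * 1024    # example fixed E-Page size from the text

E_PAGES_PER_F_PAGE = F_PAGE_SIZE // E_PAGE_SIZE
assert E_PAGES_PER_F_PAGE == 8              # a power-of-2 multiple of E-Pages per F-Page

# Hypothetical field widths for an E-Page address (channel, die, block, F-Page, E-Page).
FIELDS = [("channel", 4), ("die", 3), ("block", 12), ("f_page", 9), ("e_page", 3)]

def unpack_epage_address(addr):
    """Split a packed E-Page address into its (assumed) constituent fields."""
    out, shift = {}, sum(width for _, width in FIELDS)
    for name, width in FIELDS:
        shift -= width
        out[name] = (addr >> shift) & ((1 << width) - 1)
    return out

addr = (5 << 27) | (2 << 24) | (77 << 12) | (130 << 3) | 6
assert unpack_epage_address(addr) == {"channel": 5, "die": 2, "block": 77, "f_page": 130, "e_page": 6}
```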

To bridge between physical addressing on the data storage device and logical block addressing by the host, a logical page (L-Page) construct is introduced. An L-Page, denoted in FIG. 3 at reference numeral 212, may comprise the minimum unit of address translation used by the FMS. Each L-Page, according to one embodiment, may be associated with an L-Page number. The L-Page numbers of L-Pages 212, therefore, may be configured to enable the controller 202 to logically reference host data stored in one or more of the physical pages, such as the E-Pages 210. The L-Page 212 may also be utilized as the basic unit of compression. According to one embodiment, unlike F-Pages 208 and E-Pages 210, L-Pages 212 are not fixed in size and may vary in size, due to variability in the compression of data to be stored. Since the compressibility of data varies, a 4 KB amount of data of one type may be compressed into a 2 KB L-Page while a 4 KB amount of data of a different type may be compressed into a 1 KB L-Page, for example. Due to such compression, therefore, the size of L-Pages may vary within a range defined by a minimum compressed size of, for example, 24 bytes to a maximum uncompressed size of, for example, 4 KB or 4 KB+. Other sizes and ranges may be implemented. As shown in FIG. 3, L-Pages 212 need not be aligned with the boundaries of E-Page 210. Indeed, L-Pages 212 may be configured to have a starting address that is aligned with an F-Page 208 and/or E-Page 210 boundary, but also may be configured to be unaligned with either of the boundaries of an F-Page 208 or E-Page 210. That is, an L-Page starting address may be located at a non-zero offset from either the start or ending addresses of the F-Pages 208 or the start or ending addresses of the E-Pages 210, as shown in FIG. 3. As the L-Pages 212 are not fixed in size and may be smaller than the fixed-size E-Pages 210, more than one L-Page 212 may fit within a single E-Page 210. Similarly, as the L-Pages 212 may be larger in size than the E-Pages 210, L-Pages 212 may span more than one E-Page, and may even cross the boundaries of F-Pages 208, shown in FIG. 3 at numeral 111.

For example, where the LBA size is 512 or 512+ bytes, a maximum of, for example, eight sequential LBAs may be packed into a 4 KB L-Page 212, given that an uncompressed L-Page 212 may be 4 KB to 4 KB+. It is to be noted that, according to one embodiment, the exact logical size of an L-Page 212 is unimportant as, after compression, the physical size may span from a few bytes at minimum size to thousands of bytes at full size. For example, for a 4 TB SSD device, 30 bits of addressing may be used to address each L-Page 212 that could potentially be present in such an SSD.
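
A quick check of the addressing arithmetic in the preceding sentence, as a sketch under the stated assumptions (a 4 TB device, binary units, and 4 KB maximum uncompressed L-Pages):

```python
DEVICE_CAPACITY = 4 * 2**40       # 4 TB, binary terabytes assumed
MAX_L_PAGE_SIZE = 4 * 2**10       # 4 KB maximum uncompressed L-Page

num_l_pages = DEVICE_CAPACITY // MAX_L_PAGE_SIZE
assert num_l_pages == 2**30                      # about 1.07 billion possible L-Pages
assert (num_l_pages - 1).bit_length() == 30      # so 30 bits of L-Page addressing suffice
```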

To mitigate against lower page corruption errors, one embodiment utilizes a non-volatile buffer to temporarily store updated L-Pages at least until both the lower and upper pages of each MLC are programmed. Additional details related to the use of such a buffer are provided in commonly-assigned and co-pending U.S. patent application Ser. No. 13/675,913 filed on Nov. 13, 2012 (Atty. Docket No. T5960), the disclosure of which is hereby incorporated herein in its entirety. Such a non-volatile buffer is shown in FIG. 2 at reference numeral 211. For example, the non-volatile buffer 211 may comprise most any power-safe memory such as, for example, Magnetic Random Access Memory (MRAM), which operates at speeds comparable to DRAM while storing data in a non-volatile manner. A portion of buffer 211 may be used in supporting atomic write commands, as will be further described starting with FIG. 10.

Address Translation Map and Related Data Structures

FIG. 3 shows a logical-to-physical address translation map and illustrative entries thereof, according to one embodiment. As the host data is referenced by the host in L-Pages 212 and as the data storage device stores the L-Pages 212 in one or more contiguous E-Pages 210, a logical-to-physical address translation map is required to enable the controller 202 to associate an L-Page number of an L-Page 212 to one or more E-Pages 210. Such a logical-to-physical address translation map is shown in FIG. 3 at 302 and, in one embodiment, is a linear array having one entry per L-Page 212. Such a logical-to-physical address translation map 302 may be stored in a volatile memory 306, such as a DRAM or SRAM. FIG. 3 also shows the entries in the logical-to-physical address translation map for four different L-Pages 212, which L-Pages 212 in FIG. 3 are associated with L-Page numbers denoted as L-Page 1, L-Page 2, L-Page 3 and L-Page 4. According to one embodiment, each L-Page stored in the data storage device may be pointed to by a single and unique entry in the logical-to-physical address translation map 302. Accordingly, in the example being developed herewith, four entries are shown. As shown at 302, each entry in the map 302 may comprise information for an L-Page that is indexed by an L-Page number. That information may comprise an identification of the physical page (e.g., E-Page) containing the start address of the L-Page being referenced, the offset of the start address within the physical page (e.g., E-Page) and the length of the L-Page. In addition, a plurality of ECC bits may provide error correction functionality for the map entry. For example, and as shown in FIG. 3, and assuming an E-Page size of 2 KB, L-Page 1 may be referenced in the logical-to-physical address translation map 302 as follows: E-Page 1003, offset 800, length 1624, followed by a predetermined number of ECC bits (not shown). That is, in physical address terms, the start of L-Page 1 is within (not aligned with) E-Page 1003, and is located at an offset from the starting physical location of the E-Page 1003 that is equal to 800 bytes. Compressed L-Page 1, furthermore, extends 1,624 bytes, thereby crossing an E-Page boundary to E-Page 1004. Therefore, E-Pages 1003 and 1004 each store a portion of the L-Page 212 denoted by L-Page number L-Page 1. Similarly, the compressed L-Page referenced by L-Page number L-Page 2 is stored entirely within E-Page 1004, and begins at an offset therein of 400 bytes and extends only 696 bytes within E-Page 1004. The compressed L-Page associated with L-Page number L-Page 3 starts within E-Page 1004 at an offset of 1,120 bytes (just 24 bytes away from the boundary of L-Page 2) and extends 4,096 bytes past E-Page 1005 and into E-Page 1006. Therefore, the L-Page associated with L-Page number L-Page 3 spans a portion of E-Page 1004, all of E-Page 1005 and a portion of E-Page 1006. Finally, the L-Page associated with L-Page number L-Page 4 begins within E-Page 1006 at an offset of 1,144 bytes, and extends 3,128 bytes to fully span E-Page 1007, crossing an F-Page boundary into E-Page 1008 of the next F-Page.
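
The four worked placements above can be checked mechanically. The following sketch (Python, assuming a 2,048-byte E-Page payload as in the example) computes which E-Pages an L-Page occupies from its map entry fields; the helper name is an illustrative assumption.

```python
E_PAGE_DATA_SIZE = 2048  # assumed E-Page payload size used in the FIG. 3 example

def e_pages_spanned(start_e_page, offset, length):
    """Return the list of E-Pages holding an L-Page, given its map entry fields."""
    last = start_e_page + (offset + length - 1) // E_PAGE_DATA_SIZE
    return list(range(start_e_page, last + 1))

# The four map entries from the FIG. 3 discussion:
assert e_pages_spanned(1003, 800, 1624) == [1003, 1004]          # L-Page 1
assert e_pages_spanned(1004, 400, 696) == [1004]                 # L-Page 2
assert e_pages_spanned(1004, 1120, 4096) == [1004, 1005, 1006]   # L-Page 3
assert e_pages_spanned(1006, 1144, 3128) == [1006, 1007, 1008]   # L-Page 4
```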

Collectively, each of these constituent identifier fields (E-Page, offset, length and ECC) making up each entry of the logical-to-physical address translation map 302 may be, for example, 8 bytes in size. That is, for an exemplary 4 TB drive, the address of the E-Page may be 32 bits in size, the offset may be 12 bits (for E-Page data portions up to 4 KB) in size, the length may be 10 bits in size and the ECC field may be provided. Other organizations and bit-widths are possible. Such an 8 byte entry may be created each time an L-Page is written or modified, to enable the controller 202 to keep track of the host data, written in L-Pages, within the Flash storage. This 8-byte entry in the logical-to-physical address translation map may be indexed by an L-Page number or LPN. In other words, according to one embodiment, the L-Page number functions as an index into the logical-to-physical address translation map 302. It is to be noted that, in the case of a 4 KB sector size, the LBA is the same as the LPN. The LPN, therefore, may constitute the address of the entry within the volatile memory. When the controller 202 receives a read command from the host 218, the LPN may be derived from the supplied LBA and used to index into the logical-to-physical address translation map 302 to extract the location of the data to be read in the Flash memory. When the controller 202 receives a write command from the host, the LPN may be constructed from the LBA and the logical-to-physical address translation map 302 may be modified. For example, a new entry therein may be created. Depending upon the size of the volatile memory storing the logical-to-physical address translation map 302, the LPN may be stored in a single entry or broken into, for example, a first entry identifying the E-Page containing the starting address of the L-Page in question (plus ECC bits) and a second entry identifying the offset and length (plus ECC bits). According to one embodiment, therefore, these two entries may together correspond and point to a single L-Page within the Flash memory. In other embodiments, the specific format of the logical-to-physical address translation map entries may be different from the examples shown above.
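
As a concrete illustration of the 8-byte entry described above, here is a hypothetical bit packing: a 32-bit E-Page address, a 12-bit offset, a 10-bit length and, as an assumption, a 10-bit ECC field filling out the remaining bits. The packing order and the little-endian layout are illustrative choices, not taken from the text.

```python
E_PAGE_BITS, OFFSET_BITS, LENGTH_BITS, ECC_BITS = 32, 12, 10, 10   # ECC width is an assumption
assert E_PAGE_BITS + OFFSET_BITS + LENGTH_BITS + ECC_BITS == 64    # one 8-byte entry

def pack_entry(e_page, offset, length, ecc=0):
    assert e_page < 2**E_PAGE_BITS and offset < 2**OFFSET_BITS and length < 2**LENGTH_BITS
    word = (((e_page << OFFSET_BITS | offset) << LENGTH_BITS | length) << ECC_BITS) | ecc
    return word.to_bytes(8, "little")

def unpack_entry(raw):
    word = int.from_bytes(raw, "little")
    ecc = word & (2**ECC_BITS - 1); word >>= ECC_BITS
    length = word & (2**LENGTH_BITS - 1); word >>= LENGTH_BITS
    offset = word & (2**OFFSET_BITS - 1); word >>= OFFSET_BITS
    return word, offset, length, ecc        # (e_page, offset, length, ecc)

# L-Page 2 from the FIG. 3 discussion: E-Page 1004, offset 400, length 696 bytes.
assert unpack_entry(pack_entry(1004, 400, 696)) == (1004, 400, 696, 0)
```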

As the logical-to-physical address translation map 302 may be stored in a volatile memory, it necessarily must be rebuilt upon startup or any other loss of power to the volatile memory. This, therefore, requires some mechanism and information to be stored in a non-volatile memory that will enable the controller 202 to reconstruct the logical-to-physical address translation map 302 before the controller can “know” where the L-Pages are stored in the non-volatile memory devices after startup or after a power-fail event. According to one embodiment, such mechanism and information may be embodied in a construct that may be called a System Journal, or S-Journal. According to one embodiment, the controller 202 may be configured to maintain, in the plurality of non-volatile memory devices (e.g., in one or more of the blocks 206 in one or more die, channel or plane), a plurality of S-Journals defining physical-to-logical address correspondences. According to one embodiment, each S-Journal may cover a pre-determined range of physical pages (e.g., E-Pages). According to one embodiment, each S-Journal may comprise a plurality of journal entries, with each entry being configured to associate one or more physical pages, such as E-Pages, to the L-Page number of each L-Page. According to one embodiment, each time the controller 202 restarts or whenever the logical-to-physical address translation map 302 must be rebuilt, the controller 202 reads the S-Journals and, from the information read from the S-Journal entries, rebuilds the logical-to-physical address translation map 302.

FIG. 4 shows aspects of a method for updating a logical-to-physical address translation map and for creating an S-Journal entry, according to one embodiment. As shown therein, to ensure that the logical-to-physical address translation map 302 is kept up-to-date, whenever an L-Page is written or otherwise updated as shown at block B41, the logical-to-physical address translation map 302 may be updated as shown at B42. As shown at B43, an S-Journal entry may also be created, storing therein information pointing to the location of the updated L-Page. In this manner, both the logical-to-physical address translation map 302 and the S-Journals are updated when new writes occur (e.g., as the host issues writes to the non-volatile memory devices, as garbage collection/wear leveling occurs, etc.). Write operations to the non-volatile memory devices to maintain a power-safe copy of address translation data may be configured, therefore, to be triggered by newly created S-Journal entries (which may be just a few bytes in size) instead of re-saving all or a portion of the logical-to-physical address translation map, such that Write Amplification (WA) is reduced. The updating of the S-Journals ensures that the controller 202 can access a newly updated L-Page and that the logical-to-physical address translation map 302 may be reconstructed upon restart or other information-erasing power event affecting the volatile memory in which the logical-to-physical address translation map is stored. Moreover, in addition to their utility in rebuilding the logical-to-physical address translation map 302, the S-Journals are useful in enabling effective Garbage Collection (GC). Indeed, the S-Journals may contain the last-in-time update to all L-Page numbers, and may also contain stale entries, entries that do not point to a valid L-Page.

According to one embodiment, the S-Journal may constitute the main flash management data written to the media. According to one embodiment, S-Journals may contain mapping information for a given S-Block and may contain the Physical-to-Logical (P2L) information for a given S-Block. FIG. 5 is a block diagram showing aspects of an S-Journal, according to one embodiment. As shown therein and according to one embodiment, each S-Journal 502 covers a predetermined physical region of the non-volatile memory devices (e.g., Flash) such as, for example, 32 E-Pages as shown at 506, which are addressable using 5 bits. Each S-Journal 502 may be identified by an S-Journal Number 504. The S-Journal Number 504 used for storing P2L information for host data may comprise a portion of the address of the first physical page covered by the S-Journal. For example, the S-Journal Number of S-Journal 502 may comprise, for example, the 27 MSbs of the first E-Page covered by this S-Journal 502.

FIG. 6 shows an exemplary organization of one entry 602 of an S-Journal 502, according to one embodiment. Each entry 602 of the S-Journal 502 may point to the starting address of one L-Page, which is physically addressed in E-Pages. Each entry 602 may comprise, for example, a number (5, for example) of LSbs of the E-Page containing the starting E-Page of the L-Page. The full E-Page address may be obtained by concatenating these 5 LSbs with the 27 MSbs of the S-Journal Number in the header. The entry 602 may then comprise the L-Page number, its offset within the identified E-Page and its size. For example, each entry 602 of S-Journal 502 may comprise the 5 LSbs of the first E-Page covered by this S-Journal entry, 30 bits of L-Page number, 9 bits of E-Page offset and 10 bits of L-Page size, adding up to an overall size of about 7 bytes. Various other internal journal entry formats may be used in other embodiments.
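
A small sketch of this addressing arithmetic may help. In the Python below, the bit widths (27+5 bits for the E-Page address, 30/9/10 bits for the entry payload) come from the text; the packing order and helper names are assumptions for illustration.

```python
JOURNAL_LSB_BITS = 5                     # 32 E-Pages per S-Journal -> 5 LSbs per entry

def s_journal_number(e_page_addr):
    """27 MSbs of the address of the first E-Page covered by the S-Journal."""
    return e_page_addr >> JOURNAL_LSB_BITS

def full_e_page_address(journal_number, entry_lsbs):
    """Recombine the header's 27 MSbs with an entry's 5 LSbs."""
    return (journal_number << JOURNAL_LSB_BITS) | entry_lsbs

addr = 0x0ABCDE37                        # an arbitrary 32-bit E-Page address
assert full_e_page_address(s_journal_number(addr), addr & 0x1F) == addr

# Entry payload: 5 LSbs of E-Page + 30-bit L-Page number + 9-bit offset + 10-bit size.
ENTRY_BITS = 5 + 30 + 9 + 10
assert (ENTRY_BITS + 7) // 8 == 7        # about 7 bytes per S-Journal entry
```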

According to one embodiment, due to the variability in the compression or the host configuration of the data stored in L-Pages, a variable number of L-Pages may be stored in a physical area, such as a physical area equal to 32 E-Pages, as shown at 506 in FIG. 5. As a result of the use of compression and the consequent variability in the sizes of L-Pages, S-Journals 502 may comprise a variable number of entries. For example, according to one embodiment, at maximum compression, an L-Page may be 24 bytes in size and an S-Journal 502 may comprise over 2,500 entries, referencing an equal number of L-Pages, one L-Page per S-Journal entry 602.

As noted above, S-Journals 502 may be configured to contain mapping information for a given S-Block and may contain the P2L information for a given S-Block. More precisely, according to one embodiment, S-Journals 502 may contain the mapping information for a predetermined range of E-Pages within a given S-Block. FIG. 7 is a block diagram of a superblock (S-Block), according to one embodiment. As shown therein, an S-Block 702 may comprise one Flash block (F-Block) 704 (as also shown at 206 in FIG. 2) per die. An S-Block 702, therefore, may be thought of as a collection of F-Blocks 704, one F-Block per die, that are combined together to form a unit of the Flash Management System. According to one embodiment, allocation, erasure and GC may be managed at the Superblock level. Each F-Block 704, as shown in FIG. 7, may comprise a plurality of Flash pages (F-Page) such as, for example, 256 or 512 F-Pages. An F-Page, according to one embodiment, may be the size of the minimum unit of program for a given non-volatile memory device. FIG. 8 shows a Super Page (S-Page), according to one embodiment. As shown therein, an S-Page 802 may comprise one F-Page per block of an S-Block, meaning that an S-Page 802 spans across an entire S-Block 702.

FIG. 9A shows relationships between the logical-to-physical address translation map, S-Journals and S-Blocks, according to one embodiment. Reference 902 denotes the logical-to-physical address translation map. According to one embodiment, the logical-to-physical address translation map 902 may be indexed by L-Page number, in that there may be one entry in the logical-to-physical address translation map 902 per L-Page in the logical-to-physical address translation map. The physical address of the start of the L-Page in the Flash memory and the size thereof may be given in the map entry; namely by E-Page address, offset within the E-Page and the size of the L-Page. As noted earlier, the L-Page, depending upon its size, may span one or more E-Pages and may span F-Pages and blocks as well.

As shown at 904, the volatile memory (e.g., DRAM) may also store an S-Journal map. An entry in the S-Journal map 904 stores information related to where an S-Journal is physically located in the non-volatile memory devices. For example, the 27 MSbs of the E-Page physical address where the start of the L-Page is stored may constitute the S-Journal Number. The S-Journal map 904 in the volatile memory may also include the address of the S-Journal in the non-volatile memory devices, referenced in system E-Pages. From the E-Page referenced in an entry of the S-Journal map 904 in volatile memory, an index to the System S-Block Information 908 may be extracted. The System S-Block Information 908 may be indexed by System S-Block (S-Block in the System Band) and may comprise, among other information regarding the S-Block, the size of any free or used space in the System S-Block. Also from the S-Journal map 904, the physical location of the S-Journals 910 in the non-volatile memory devices may be extracted.

The System Band, according to one embodiment, does not contain L-Page data and may contain all Flash Management System (FMS) meta-data and information. The System Band may be configured as lower-page only for reliability and power fail simplification. During normal operation, the System Band need not be read except during Garbage Collection. According to one embodiment, the System Band may be provided with significantly higher overprovisioning than the data band for overall WA optimization. Other bands may include the Hot Band, which may contain L-Page data and is frequently updated, and the Cold Band, which is a physical area of memory storing static data retained from the garbage collection process, which may be infrequently updated. According to one embodiment, the System, Hot and Cold Bands may be allocated by controller firmware on an S-Block basis.

As noted above, each of these S-Journals in the non-volatile memory devices may comprise a collection of S-Journal entries and cover, for example, 32 E-Pages worth of data. These S-Journals 910 in the non-volatile memory devices enable the controller 202 to access the S-Journal entries in the non-volatile memory devices upon startup and enable the controller 202 to rebuild in volatile memory not only the logical-to-physical address translation map 902, but also the S-Journal map 904, the User S-Block Information 906, and the System S-Block Information 908.

The S-Journals in the non-volatile memory devices may also contain all of the stale L-Page information, which may be ignored during garbage collection after the logical-to-physical address translation map 902 and the S-Journal Map 904 in volatile memory are rebuilt. The S-Journals, therefore, may be said to contain a sequential history of all currently valid updates as well as some stale updates to the logical-to-physical address translation map 902.

FIG. 9B is a block diagram of another view of an S-Journal Map 904, according to one embodiment. The S-Journal Map 904 may reference a plurality of S-Journal entries for each S-Block. According to one embodiment, the S-Block Number may be the MSb of the S-Journal Number. The size of the S-Journal map 904 may be correlated to the number of S-Blocks times the number of S-Journal entries per S-Block. Indexing into the S-Journal Map 904, therefore, may be carried out by referencing the S-Block Number (the MSb of the S-Journal Number) and the S-Journal entry for that S-Block number. The controller 202 may be further configured to build or rebuild a map of the S-Journals and store the resulting S-Journal Map 904 in volatile memory. For example, upon restart or upon the occurrence of another event in which power fails or after a restart subsequent to error recovery, the controller 202 may read the plurality of S-Journals in a predetermined sequential order, build a map of the S-Journals stored in the non-volatile memory devices based upon the sequentially read plurality of S-Journals, and store the built S-Journal Map 904 in the volatile memory. In particular, the rebuilt S-Journal Map 904 may be configured to contain the physical location for the most recently-written version of each S-Journal. Indeed, according to one embodiment, in rebuilding the S-Journal Map 904, the physical location of older S-Journals may be overwritten when a newer S-Journal is found. Stated differently, according to one embodiment, the S-Journal Map 904 may be rebuilt by the controller 202 based upon read S-Journals that are determined to be valid.
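
A minimal sketch of this rebuild, assuming the S-Journals are visited oldest-to-newest and that each carries its S-Journal Number and a validity check; the function and variable names are illustrative only.

```python
def rebuild_s_journal_map(journals_in_write_order):
    """journals_in_write_order yields (s_journal_number, physical_location, is_valid) tuples."""
    s_journal_map = {}
    for number, location, is_valid in journals_in_write_order:
        if is_valid:                          # only S-Journals that pass validation are used
            s_journal_map[number] = location  # a newer copy simply overwrites any older one
    return s_journal_map

# Two copies of S-Journal 0x40 exist; the later one wins.
journals = [(0x40, "E-Page 9000", True), (0x41, "E-Page 9032", True), (0x40, "E-Page 9064", True)]
assert rebuild_s_journal_map(journals)[0x40] == "E-Page 9064"
```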

Atomic Write Commands

In one embodiment, to maintain the coherency of the logical-to-physical address translation map and to provide a mechanism for recovering from unsuccessful (incomplete) atomic write commands, the original entry or entries in the logical-to-physical address translation map should preferably be maintained until such time as the atomic write command is determined to have been successfully completed. Such a mechanism should enable a determination of an unsuccessful atomic write command, even in the presence of an intervening power fail event, and must safeguard access to the original data stored in the non-volatile memory devices. According to one embodiment, atomic sequence numbers may be used for this purpose.

FIG. 10 is a block diagram of an atomic sequence number table data structure 1002 that may be stored, for example, in power-safe memory and in which atomic sequence numbers 1011, used in processing atomic write commands, may be stored, according to one embodiment. According to one embodiment, the power-safe memory may comprise volatile memory with a battery back-up, a volatile memory that is safely stored to the non-volatile memory devices upon power-down and safely restored to the volatile memory upon power-up, or MRAM, for example. As shown in FIG. 10, the unique atomic sequence numbers 1011 each may be stored in an atomic slot 1015. According to one embodiment, each atomic slot 1015 may be associated with, for error recovery purposes, a value of a CRC 1013 applied to its atomic sequence number. For example, a 16-bit CRC 1013 may be generated to ensure proper recovery of the atomic sequence number in case of corruption thereof. Each L-Page of an atomic write command may, according to one embodiment, be associated with one of the atomic slots 1015, each of which may be configured to store the atomic sequence number corresponding to the atomic write command. Such slots 1015 may be freed once it is determined that the atomic write command has successfully completed. The freed slots may then be added to a free slot list and re-used during subsequent atomic write commands.
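
For illustration, a minimal Python sketch of such a slot table, pairing each slot's sequence number with a 16-bit CRC so a corrupted slot can be detected at power-up. The CRC-16-CCITT polynomial, the 48-bit sequence-number width and the helper names are assumptions; the text only calls for some 16-bit CRC over the stored sequence number.

```python
def crc16_ccitt(data, crc=0xFFFF):
    """A plain CRC-16 (CCITT polynomial, assumed here) over a byte string."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

SEQ_BYTES = 6   # assume a 48-bit sequence number, per the sizing example given later

def write_slot(slots, slot_no, seq_no):
    slots[slot_no] = (seq_no, crc16_ccitt(seq_no.to_bytes(SEQ_BYTES, "little")))

def read_slot(slots, slot_no):
    seq_no, stored_crc = slots[slot_no]
    if crc16_ccitt(seq_no.to_bytes(SEQ_BYTES, "little")) != stored_crc:
        raise ValueError("atomic slot %d is corrupted" % slot_no)
    return seq_no

slots = {}
write_slot(slots, 0, 10006)
assert read_slot(slots, 0) == 10006
```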

In one embodiment, the atomic sequence numbers may be unique with respect to individual slots, or a sub-group of slots. For example, all the atomic sequence numbers may be unique to slot 0, but non-unique with respect to the atomic sequence numbers used for slot 1. A bit or a flag value may be used to indicate different groupings within which uniqueness is guaranteed.

In addition, in one embodiment, the slots may be used in such a way as to prevent the same slot from being used for consecutive sequence numbers. The scheme prevents a scenario in which several consecutive sequence numbers are used in the same slot. In such a case, if power loss occurs and the writing of a sequence number to a slot becomes corrupted, then upon power-up it cannot be determined what the maximum sequence number in use was before the power loss. If the maximum sequence number cannot be determined, then the uniqueness of the sequence numbers assigned cannot be guaranteed.

An example scheme for ensuring that consecutive sequence numbers are not used in the same slot is shown in FIGS. 15A-D. In FIG. 15A, five of the six slots are in use and the next sequence number, 10006, is assigned to the free slot no. 4, which is shown in FIG. 15B. In FIG. 15C, slot no. 4 is blocked from being assigned to the next sequence number, which is 10007, since it was just assigned to the prior sequence number 10006. This is the case even if the atomic command associated with sequence number 10006 completes before the other commands and slot no. 4 becomes the first free slot. However, sequence number 10007 can use any other slot. So in FIG. 15D, sequence number 10007 is assigned to slot no. 5 when it becomes available. Now slot no. 5 becomes off-limits to sequence number 10008, and so on. This scheme ensures that the maximum sequence number already used can be determined with certainty. If FIG. 15D reflects the condition of the slots encountered at power-up, the next sequence number to be used may be the maximum one encountered (which is 10007 in this case), plus some offset (e.g., two, so that 10007+2=10009). This ensures that the next sequence number assigned is unique and has not been used before the power cycle.
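
The following sketch captures the two rules described above: block the slot that received the immediately preceding sequence number, and on power-up resume from the maximum sequence number found plus an offset. The helper names, the free-slot list and the default offset of two are illustrative assumptions.

```python
def assign_slot(free_slots, last_used_slot):
    """Pick any free slot except the one that received the immediately preceding sequence number."""
    for slot in free_slots:
        if slot != last_used_slot:
            return slot
    return None   # no eligible slot yet; wait for another slot to free up

def next_sequence_number_after_power_up(slot_contents, offset=2):
    """slot_contents maps slot number -> sequence number found in that slot at power-up."""
    return max(slot_contents.values()) + offset

# FIG. 15D-style state: sequence number 10006 in slot 4 and 10007 in slot 5.
assert next_sequence_number_after_power_up({4: 10006, 5: 10007}) == 10009
# Slot 4 just received 10006, so even if it frees up first it cannot take 10007.
assert assign_slot(free_slots=[4, 0], last_used_slot=4) == 0
```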

The atomic sequence numbers, according to one embodiment, may be used to filter out partial (e.g., in-process or interrupted) atomic writes during reconstruction of the logical-to-physical address translation map after a shutdown or other power-fail event. In one embodiment, the filtering is enabled by associating persistent mapping information (e.g., S-Journal entries) of atomic write commands with an atomic sequence number that is present in the power-safe memory until the command is completed. In one embodiment, that associated atomic sequence number is changed as a commit step to signify the completion of the atomic write command, and upon reconstruction of the mapping table, the absence of a matching sequence number in the power-safe memory signifies that the associated persistent mapping information relates to a completed atomic write command.

To ensure that the atomic sequence number for an atomic write command is not affected by such a power-fail event, it may be, according to one embodiment, stored in a power-safe memory that may be consulted during reconstruction of the logical-to-physical address translation map. According to one embodiment, the power-safe memory may comprise an MRAM or, for example, a battery-backed RAM or some other form of non-volatile RAM. The atomic sequence number stored therein may be relied on as a reliable indicator of whether an atomic write command successfully completed or not. To do so, the atomic sequence number may be configured to be unique. According to one embodiment, the atomic sequence number may be configured to be non-repeating over a projected lifetime of the data storage device. For example, the unique sequence number may comprise a large sequence of bits, each combination of which is used only once. For example, the large sequence of bits may be initialized to all 1's or all 0's and either decremented or incremented upon each occurrence of an atomic write. For example, for a representative 2 TB drive and 4 KB L-Pages (maximum uncompressed size, according to one embodiment, of an L-Page), a sequence number of 48 bits would be more than sufficient to provide 512K unique sequence numbers every second for a period of 5 years.
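
A quick check of that sizing claim, as a sketch (the 365-day year is an assumption of the arithmetic):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
needed = 512 * 1024 * SECONDS_PER_YEAR * 5   # ~8.3e13 sequence numbers issued in 5 years
assert needed < 2**48                        # 2**48 is roughly 2.8e14, so 48 bits suffice
```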

According to one embodiment, the physical-to-logical mapping shown and described herein may be modified to accommodate atomic write commands. Indeed, as described above, data may be stored in a plurality of L-Pages, each of which is associated with a logical address. The logical-to-physical address translation map, maintained in the volatile memory, continues to enable determination of the physical location, within one or more of the physical pages, of the data referenced by each logical address. It is recalled that, for a non-atomic command, data specified by such non-atomic write command is stored in one or more L-Pages and that the logical-to-physical address translation map is updated after each L-Page of non-atomic data is written.

Keeping the foregoing in mind, according to one embodiment, such a process may be modified for atomic write commands. Indeed, upon receipt of an atomic write command, the data specified by the atomic write command may be stored in one or more L-Pages, as is the case for non-atomic writes. For atomic writes, however, the logical-to-physical address translation map is not, according to one embodiment, updated after each L-Page of atomic write data is written. Instead, the update to the logical-to-physical address translation map may be deferred until all L-Pages storing data specified by the atomic write command have been written in a power-safe manner.

Prior to updating the logical-to-physical address translation map, mapping information related to the atomic write command may be written in volatile memory. According to one embodiment, such mapping information may comprise an indication of the physical location, in the non-volatile memory devices, of each L-Page storing the data specified by the atomic write command. Specifically, according to one embodiment, the mapping information for the L-Pages storing the data specified by the atomic write command may comprise the equivalent of a logical-to-physical address translation map entry. Such entry, according to one embodiment, may be stored separately from other entries in the logical-to-physical address translation map 302 in volatile memory, as the logical-to-physical address translation map may not be updated until all data specified by the atomic write command has been written in a power-safe manner.

FIG. 11 shows the manner, according to one embodiment, in which this indication of the physical location of L-Pages storing data specified by the atomic write command may be stored in the volatile memory. FIG. 11 shows an entry 1106 in a logical-to-physical address translation map and the location 1110 in the non-volatile memory devices where E-Page(s) storing such L-Page are stored, for a non-atomic write, according to one embodiment. FIG. 11 also shows the indication 1108 of the physical location 1112 of L-Page(s) storing data specified by the atomic write command, according to one embodiment. Note that although the data specified by the atomic write command is written to the non-volatile memory devices in the manner described above, the logical-to-physical mapping may be carried out differently.

The entry 1106 in the logical-to-physical address translation map (corresponding to mapping information for a non-atomic write command) may conform, for example, to the format specified in FIG. 3, and may comprise an 8 byte LPN. The indication 1108 of the physical location 1112 of the L-Page(s) storing the data specified by the atomic write command (the mapping information corresponding to an atomic write command), on the other hand, is not, according to one embodiment, an entry in the logical-to-physical address translation map. Although this indication 1108 may have the same format (E-Page+Offset+Length, for example) as the logical-to-physical address translation map entries shown in FIG. 3, such indication 1108 may not, according to one embodiment, be stored in the logical-to-physical address translation map. Indeed, according to one embodiment, such indication 1108 may be stored separately from the logical-to-physical address translation map.

As shown in FIG. 11, the logical-to-physical address translation map may be configured to store mapping entries spanning at least a portion of a logical capacity of the data storage device (e.g., 2 TB or some fraction thereof for a 2 TB data storage device). This is referred to, in FIG. 11, as the normal range 1114. The normal range 1114, therefore, may be configured to contain mapping entries of the logical-to-physical address translation map that map LBAs to physical locations within the non-volatile memory devices from a 0th L-Page to a Max L-Page; that is, up to the maximum storage capacity of the data storage device. The atomic range 1116 may begin, according to one embodiment, beyond the normal range at Max L-Page+1. Therefore, writing the indication 1108 of the physical location 1112 of L-Page(s) storing data specified by the atomic write command to the atomic range 1116 does not constitute an update to the logical-to-physical address translation map. It is to be understood, therefore, that this indication 1108 of the physical location 1112 of L-Page(s) storing data specified by the atomic write command may be written to any other portion of a volatile memory, whether the same volatile memory storing the logical-to-physical address translation map or not. Therefore, the indication 1108 of the physical location 1112 of L-Page(s) storing data specified by the atomic write command may be stored, for example, in an area of the volatile memory 1102 other than that portion thereof storing the logical-to-physical address translation map, or in another volatile memory altogether. The indication 1108 of the physical location 1112 of L-Page(s) storing data specified by the atomic write command may, in the same manner as an entry in the logical-to-physical address translation map, point to the physical location 1112, in the non-volatile memory devices 1104, where such L-Pages are stored.

According to one embodiment, after all L-Pages storing the data specified by the atomic write command are written, the logical-to-physical address translation map may be updated with the indication of the physical location 1112 of L-Page(s) storing data specified by the atomic write command. That is, according to one embodiment, it is only when the L-Page(s) storing data specified by the atomic write command have been written in a power safe manner that the logical-to-physical address translation map may be updated with the indication 1108 of the physical location 1112 of L-Page(s) storing data specified by the atomic write command. For example, the corresponding entry 1108 in the atomic range 1116 may be copied to the normal range 1114, which updates the logical-to-physical address translation map. Note that the physical location 1112 in the non-volatile memory devices 1104 corresponding to the L-Page(s) storing data specified by the atomic write command does not change, as only the location of the indication 1108 (i.e., the mapping information) changes, not the physical location of the data pointed thereto in the non-volatile memory devices.

According to one embodiment, after the logical-to-physical address translation map has been updated, the atomic write command may be considered to be effectively complete. At that time, the successful completion of the atomic write command may be acknowledged to the host, as all of the data specified thereby has been stored in a power safe manner and as the logical-to-physical address translation map has been successfully updated, thereby maintaining the coherency of the logical-to-physical address translation map, even in the event of a power cycle.

As noted above, according to one embodiment, it is only when all of the L-Page(s) storing data specified by the atomic write command have been written in a power safe manner that the logical-to-physical address translation map may be updated with the indication 1108 of the physical location 1112 of L-Page(s) storing data specified by the atomic write command. To determine whether all L-Pages storing data specified by the atomic write command have been stored in a power-safe manner, one embodiment comprises modifying S-Journal entries for atomic write commands. Recall that S-Journals define physical-to-logical address correspondences, with each S-Journal comprising a plurality of entries that are configured to associate one or more physical pages to each L-Page. According to one embodiment, S-Journal entries for L-Pages storing data specified by an atomic write command are configured to form part of a mechanism to enable a determination of whether the atomic write command was completed or not completed, upon reconstruction of the logical-to-physical address translation map. Such reconstruction of the logical-to-physical address translation map may have been necessitated, for example, by the occurrence of a power fail event. Indeed, if the power fail event occurred while the controller 202 was processing an atomic write command, all of the L-Pages storing data specified by the atomic write command may or may not have been stored in a power-safe manner. Moreover, in the event of a power cycle, the indication 1108 of the physical location 1112 of L-Page(s) storing data specified by the atomic write command is no longer available, as such indication was stored in volatile memory. The corresponding S-Journal entries, modified for the atomic write command, may, according to one embodiment, provide part of a persistent mechanism for determining whether the atomic write successfully completed or not prior to the power fail event.

According to one embodiment, by reference to the S-Journal entry or entries for the L-Page(s) storing data specified by the atomic write command and the unique sequence number stored in the power-safe memory for that atomic write command, the controller 202 may determine whether the atomic write command was successfully completed. According to one embodiment, if the atomic write command is determined to not have completed successfully, the corresponding S-Journal entry or entries are not used during reconstruction of the logical-to-physical address translation map, thereby maintaining its coherency and ensuring that the all-or-nothing aspect of atomic writes is respected. If, however, reference to the S-Journal entry or entries for the L-Page(s) storing data specified by the atomic write command and the atomic sequence number stored in the power-safe memory for that atomic write command indicates that the atomic write command did, in fact, complete successfully, the corresponding S-Journal entry or entries may be safely used to reconstruct the logical-to-physical address translation map.

According to one embodiment, each entry 1210 of an S-Journal 1202 for an atomic write command may comprise, in addition to the indication of the location, within the non-volatile memory devices, of one L-Page storing data specified by the atomic write command (shown in FIG. 12 as L-Page 1206), a unique sequence number, such as shown at 1208 in FIG. 12. As also shown at 1208 in FIG. 12, in addition to the atomic sequence number, each entry 1210 of an S-Journal 1202 for an atomic write command may also comprise, according to one embodiment, a slot number. According to one embodiment, this slot number may correspond to one of a plurality of slots defined in the power-safe memory (e.g., MRAM, battery-backed RAM or NVRAM). Such power-safe memory is shown in FIG. 2 at 211. The indication of the location, within the non-volatile memory devices, of one L-Page storing data specified by the atomic write command (L-Page 1206) may comprise, according to one embodiment, an atomic header specifically identifying that entry as having been made during an atomic write command. For example, such header may comprise, in addition to the normal header of a non-atomic write entry (for example, the 5 LSbs of the E-Page containing the starting E-Page of the L-Page concatenated with the 27 MSbs of the S-Journal Number 1204), the atomic slot number and the atomic sequence number for that slot. Other organizations are possible.

According to one embodiment, for each atomic write command, one of the non-repeating atomic sequence numbers may be generated and saved in one of the plurality of slots in the power-safe temporary storage. According to one embodiment, for each atomic write command, each slot defined within the power-safe temporary storage may store the same unique sequence number. That same unique sequence number is also saved within each entry or entries 1210 of the S-Journal or S-Journals comprising entries for the L-Page or L-Pages storing data specified by the atomic write command. According to one embodiment, it is only when the atomic write command has completed that the unique sequence number stored in a slot defined in the power-safe temporary storage is changed, indicating a commit of the atomic write command. According to one embodiment, the changing of the unique sequence number associated with the atomic write command, indicative of a completed atomic write command, is carried out before acknowledging the completion of the atomic write command to a host 218.

This changed atomic sequence number, at this point in time, corresponds to and may be used by a next-occurring atomic write command. The changing of the unique sequence number associated with the atomic write command may comprise, for example, incrementing or decrementing the current sequence number. The changing of the atomic sequence number in the power-safe temporary storage, therefore, may serve as the remaining portion of the mechanism for determining whether a given atomic write command has successfully completed. Indeed, according to one embodiment, the controller 202 may determine whether the atomic write command has completed during reconstruction of the translation map (and thus whether to update the logical-to-physical translation map with the S-Journal entry for the L-Pages specified by the atomic write) by comparing the unique sequence number stored in the S-Journal entry or entries for that atomic write command with the unique sequence number stored in the power-safe temporary storage.

As the unique sequence number is only changed upon successfully completing the atomic write command, finding an identical unique sequence number in the S-Journal entry corresponding to an L-Page specified by an atomic write command and in the power-safe temporary storage is indicative of the corresponding atomic write command not having completed successfully. That is, a match between the unique sequence number stored in the S-Journal entry or entries for the atomic write command and the unique sequence number stored in the power-safe temporary storage indicates an incomplete atomic write command. Such a match also means that the atomic write command was not acknowledged to the host and that the L-Page information in the S-Journal(s) containing entries for the L-Page(s) specified by the atomic write command should not be used to reconstruct the logical-to-physical address translation map. Other than as modified herein, the reconstruction of the logical-to-physical address translation map may be carried out according to the methods shown and described in commonly-assigned and co-pending U.S. patent application Ser. No. 13/786,352 filed on Mar. 5, 2013 (Atty. Docket No. T5944), the disclosure of which is hereby incorporated herein in its entirety.

In one embodiment, upon accessing the atomic sequence number in the power-safe temporary storage, a check may be carried out to ensure the validity of the CRC associated with the atomic sequence number. According to one embodiment, when the unique sequence number stored in the S-Journal entry or entries for the atomic write command is not the same as the atomic sequence number stored in the power-safe temporary storage, the S-Journal entry or entries are used to update the logical-to-physical address translation map. However, according to one embodiment, when a match occurs between the unique sequence number stored in the S-Journal entry or entries for the atomic write command and the unique sequence number stored in the power-safe temporary storage during reconstruction, the S-Journal entry or entries are not used to update the logical-to-physical address translation map and the atomic write command will appear as if it was never executed.
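
The reconstruction-time decision described above can be sketched as follows; the CRC check placement and the return convention are illustrative assumptions.

    def should_apply_sjournal_entry(entry_sequence_number: int,
                                    slot_sequence_number: int,
                                    slot_crc_is_valid: bool) -> bool:
        # Validate the CRC associated with the atomic sequence number first
        # (error handling here is an assumption for the sketch).
        if not slot_crc_is_valid:
            raise ValueError("atomic sequence number failed CRC check")
        # A mismatch means the atomic write committed: apply the entry to the
        # logical-to-physical translation map. A match means the command never
        # completed: skip the entry so the write appears never to have executed.
        return entry_sequence_number != slot_sequence_number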

Handling Partial Atomic Write Commands

In one embodiment, there is an additional process to address the relics of a partial atomic write scenario. In one embodiment, the controller tracks additional information concerning the L-Page range affected by a partially completed atomic write command. As an example, when a match occurs in the sequence number during reconstruction, indicating a partial atomic write command, a tracking table is consulted to determine the extent of data written by the partial atomic write command.

An example tracking table used in one embodiment, shown as three versions corresponding to three time periods, is shown in FIGS. 16A-C. The three figures show how the tracking table tracks a partially completed atomic write command. In the example shown, a partially completed atomic command was intended to write to LPN (L-Page Number) 100 through LPN 103, but had only written to LPN 100 and LPN 102 (denoted by shaded boxes). The command was interrupted before LPN 101 and LPN 103 could be written. FIG. 16A shows the tracking table initialized with default values at start-up, before the attempted execution of the command in question. In one embodiment, the MIN and MAX address fields are seeded with default address values of a maximum value such as FFFFF and a minimum value such as 0, respectively.

In FIG. 16B, the tracking table has recorded the fact that the atomic command with Seq. No. N has written to LPN 100. An LPN written by an atomic write command with a matched atomic write sequence number is compared to both the MIN and the MAX field address values as follows. If the written LPN is less than the current MIN value, the written LPN becomes the current MIN value. In the example of FIG. 16A and FIG. 16B, since LPN 100 is less than FFFFF, LPN 100 replaces FFFFF as the MIN value. Conversely, in the MAX field, if the written LPN is greater than the current MAX value, the written LPN becomes the current MAX value. Thus, in the example, LPN 100 also replaces 0 in the MAX field. FIG. 16C shows the state of the table after LPN 102 is written. The MIN field remains unchanged since LPN 102 is greater than LPN 100, but the MAX field is updated to LPN 102. Since each LPN of a command can be written out of order, the tracking table enables tracking of the range of L-Pages affected by an atomic write command and enables recovery if the command does not complete. Over the course of execution, the MIN and MAX fields are filled and correlated with various atomic sequence numbers as shown in the figures.
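
A minimal sketch of the MIN/MAX update, walking through the example of FIGS. 16A-C, might look as follows (the table entry is modeled as a dictionary for illustration).

    def update_tracking_entry(entry: dict, written_lpn: int) -> None:
        # Compare the written LPN against both the MIN and MAX fields.
        if written_lpn < entry["min_lpn"]:
            entry["min_lpn"] = written_lpn
        if written_lpn > entry["max_lpn"]:
            entry["max_lpn"] = written_lpn

    entry = {"seq_no": "N", "min_lpn": 0xFFFFF, "max_lpn": 0}  # FIG. 16A defaults
    update_tracking_entry(entry, 100)  # FIG. 16B: MIN = 100, MAX = 100
    update_tracking_entry(entry, 102)  # FIG. 16C: MIN = 100, MAX = 102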

In one embodiment, the tracking enables a clean-up process during reconstruction. In one embodiment, as one of the final steps of reconstruction, for each partially completed atomic command detected, a copy command is issued to copy the original data spanning from the MIN address to the MAX address indicated in the tracking table, so that the original data is re-written, thereby generating new S-Journal entries. This has the effect of eliminating the partial atomic write for future power cycles. Continuing with the present example in FIGS. 16A-C, upon detecting that the command with Seq. No. N did not complete, the clean-up procedure will re-write LPN 100 through 102 so that the original versions of the L-Pages at LPN 100 and 102 are rewritten in the non-volatile memory and new S-Journal entries are generated to account for the new writes. Future reconstruction will correctly account for the fact that the atomic write did not complete, as the latest S-Journal entries will indicate that the data in the affected address range has been reverted back to the original state before the failed atomic write.
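
The clean-up step can be sketched as follows, where copy_lpn_range is a hypothetical helper that issues the copy command for an inclusive LPN range and thereby generates new S-Journal entries.

    def clean_up_partial_atomic(entry: dict, copy_lpn_range) -> None:
        # Re-write the original data spanning MIN..MAX so that the latest
        # S-Journal entries reflect the pre-atomic-write state.
        copy_lpn_range(entry["min_lpn"], entry["max_lpn"])

    # In the example of FIGS. 16A-C, this issues a copy of LPN 100 through 102.
    clean_up_partial_atomic({"min_lpn": 100, "max_lpn": 102},
                            lambda lo, hi: print(f"copy LPN {lo}..{hi}"))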

According to one embodiment, the data storage device may report that it is ready to process host commands shortly after having completed the reconstruction of the logical-to-physical address translation map (and optionally other housekeeping activities such as populating the S-Journal map 904 and other tables in volatile memory). In such an embodiment, the data storage device is configured to carry out free space accounting operations (including, e.g., rebuilding one or more free space table(s)) while and/or after processing host (i.e., data access) commands. Such incoming host commands may alter the free space accounting of the S-Blocks. Such changes in the amount of valid data that is present in each S-Block may be accounted for, according to one embodiment. With respect to atomic write commands, according to one embodiment, free space accounting, as described above, may be deferred until after all L-Pages storing data specified by the atomic write command have been stored in the non-volatile memory devices and the atomic write command is determined to have completed.

Garbage Collection

According to one embodiment, atomic sequence numbers affect the manner in which garbage collection may be carried out, both on the user band (where user data may be stored) and the system band (which contains File Management System meta-data and information). When an S-Journal is parsed during garbage collection of the user band, and an atomic write entry (identified by its header, for example) is encountered, the atomic sequence number in the specified slot may be checked against the atomic sequence number stored in the power-safe temporary storage (e.g., the MRAM, battery-backed RAM, or other form of non-volatile RAM).

For example, the atomic sequence number may be stored in the non-volatile buffer 211. If the two do not match, the atomic write command completed successfully and the L-Page(s) storing the data specified by the atomic write command may be copied and moved to another S-Block or S-Blocks. The header of the L-Page(s) may be stripped of its atomic write attributes when generating the new S-Journal entry for the copied and moved data. If, however, the atomic sequence number in the specified slot matches the atomic sequence number stored in the power-safe temporary storage (an unlikely event, as that S-Block would presumably not have been picked for garbage collection), indicating an in-process atomic write command, then the corresponding L-Page may be copied, kept atomic, and an update may be carried out to the mapping information (such as 1108 in FIG. 11, for example) comprising an indication of the physical location of the L-Page storing the data specified by the atomic write command.

When an S-Journal is parsed during garbage collection of the system band, and an atomic write entry (identified by its header, for example) is encountered, the atomic sequence number in the specified slot may be checked against the atomic sequence number stored in the power-safe temporary storage (e.g., the MRAM). If the two do not match, the atomic write command completed successfully and the L-Page(s) storing the data specified by the atomic write command may be copied and moved to another S-Block or S-Blocks. In that case, the header of the L-Page may be stripped of its atomic write attributes. If, however, the atomic sequence number in the specified slot matches the atomic sequence number stored in the power-safe temporary storage, indicating an in-process atomic write command, then the corresponding L-Page may be copied and moved to another S-Block or S-Blocks, keeping the header indicative of an atomic write, and the mapping information (such as 1108 in FIG. 11, for example) comprising an indication of the physical location of the L-Page storing the data specified by the atomic write command may be suitably updated.
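
The garbage-collection handling of an atomic write entry, which is essentially the same for the user band and the system band, can be sketched as follows; the returned labels are illustrative only and do not represent an actual firmware interface.

    def handle_atomic_entry_during_gc(entry_sequence_number: int,
                                      slot_sequence_number: int) -> str:
        if entry_sequence_number != slot_sequence_number:
            # The atomic write committed: copy/move the L-Page(s) and strip the
            # atomic attributes from the header in the new S-Journal entry.
            return "copy_and_strip_atomic_header"
        # Sequence numbers match: the atomic write is still in process, so the
        # L-Page is copied/moved with its atomic header kept and the mapping
        # information is updated to the new physical location.
        return "copy_keep_atomic_header_and_update_mapping"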

Summary—Handling Atomic Write Commands

FIG. 13 is a flowchart of a method of performing an atomic write command in a data storage device comprising a volatile memory and a plurality of non-volatile memory devices that are configured to store a plurality of physical pages. As shown therein, block B131 calls for receiving an atomic write command, whereupon the data specified by the atomic write command may be stored in one or more L-Pages, as shown at B132. At B133, it may be determined whether all L-Pages of data specified by the atomic write command have been stored in the non-volatile storage devices (e.g., Flash). B132 may be carried out again (NO Branch of B133) until all L-Pages of the atomic write command have, in fact, been stored (YES Branch of B133). This operates to defer updates to the logical-to-physical address translation map until all such L-Pages of the atomic write command have been stored. When all L-Pages of data specified by the atomic write command have been stored, the logical-to-physical address translation map may be updated with the one or more L-Pages storing the data specified by the atomic write command, as shown at B134.
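
The overall flow of FIG. 13 can be sketched as shown below, with store_l_page and update_translation_map as hypothetical stand-ins for the storage and mapping layers.

    def perform_atomic_write(l_pages, store_l_page, update_translation_map):
        stored_locations = []
        for l_page in l_pages:                     # B132: store each L-Page
            stored_locations.append(store_l_page(l_page))
        # B133: all L-Pages of the atomic write have now been stored, so the
        # deferred update of the translation map (B134) is performed.
        update_translation_map(stored_locations)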

According to one embodiment, blocks B132A1 and B132A2 may be carried out between blocks B132 and B133; that is, prior to updating the logical-to-physical address translation map. As shown at B132A1, mapping information (such as 1108 in FIG. 11, for example) comprising an indication of the physical location of each L-Page storing the data specified by the atomic write command may be stored in volatile memory. Also, for each L-Page storing data specified by the atomic write command, an S-Journal entry may be generated, as shown at B132A2. This generated S-Journal entry may be configured, as shown in FIG. 13, to enable a determination of whether the atomic write command has completed or has not completed upon reconstruction of the logical-to-physical address translation map. Such an S-Journal entry, as shown in FIG. 12, may comprise an L-Page and an indication of the atomic sequence number and the slot number where such sequence number is stored. In one embodiment, the header of an L-Page written by an atomic write command contains the same atomic sequence number and slot number information. This allows for reconstruction even if the corresponding S-Journal entry is not written. The reconstruction process is configured in one embodiment to process L-Pages for which no journal entries were found, and use the header information in those L-Pages to rebuild the mapping table.

As shown in FIG. 14, the atomic write may be committed by changing the atomic sequence number in the power-safe temporary storage (e.g., 211 in FIG. 2) such that the changed atomic sequence number does not match the atomic sequence number in the generated S-Journal entry or entries for that atomic write command, as shown at B141. Thereafter, the atomic write command may be considered to have been completed and an atomic write complete acknowledgement is sent to the host, as shown at B142.

Atomic Commands Writing Scattered Information

In some embodiments, an atomic write command may involve writing to LBAs (e.g., L-Pages) that are scattered across the range of available LBAs. Also, multiple write commands writing to different ranges can be indicated as atomic by the host, so that all the write commands must complete as a group or not at all. This is useful, for example, in a financial transaction where funds need to be debited from one account and credited to another, and the writes to record such a transaction need to be atomic. The account records may be scattered in different locations within the data storage device. In addition, many relational database applications have rollback features, and the ability to atomically write scattered LBAs in a data storage device supporting the database applications may significantly enhance the performance of such database applications.

FIG. 17 shows how one embodiment handles situations in which multiple disparate LBAs are specified in one host atomic command (or in multiple host commands that are grouped together into one atomic operation). As shown in the figure, the LBAs are broken into multiple ranges in which each range consists of contiguous LBAs. In the example embodiment shown in FIG. 17, three “sub-commands” are generated, with each sub-command handling a range. The processing of each sub-command is mostly the same as that of the single atomic command previously described above, with some minor changes in the commit procedure.
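
Breaking the scattered LBAs into contiguous ranges, one per sub-command, might be sketched as follows.

    def split_into_contiguous_ranges(lbas):
        # Break a set of scattered LBAs into contiguous ranges (sub-commands).
        ranges = []
        for lba in sorted(set(lbas)):
            if ranges and lba == ranges[-1][1] + 1:
                ranges[-1][1] = lba            # extend the current contiguous range
            else:
                ranges.append([lba, lba])      # start a new range (sub-command)
        return [tuple(r) for r in ranges]

    # Example: scattered LBAs become three sub-command ranges, as in FIG. 17.
    print(split_into_contiguous_ranges([10, 11, 12, 40, 41, 90]))  # [(10, 12), (40, 41), (90, 90)]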

As shown at reference numeral 1702, each sub-command with its associated LBA range is assigned an atomic sequence number in a unique slot and processed atomically, independently of the other sub-commands (and their associated LBA ranges). In one embodiment, the atomic commit process for each sub-command (i.e., the updating of the sequence number at the assigned slot at the commit phase) is held off until all sub-commands (LBA ranges) have been successfully written. At that point, the atomic commit is performed for all of the sequence numbers at the associated slots at the same time.
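
A minimal sketch of this deferred group commit, assuming the power-safe slots are modeled as a dictionary of sequence numbers, follows.

    def commit_atomic_group(power_safe_slots: dict, slot_numbers,
                            all_ranges_written: bool) -> None:
        if not all_ranges_written:
            return                          # hold off the commits
        for slot in slot_numbers:
            power_safe_slots[slot] += 1     # commit every sub-command's slot together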

To maintain atomicity in the event of power failure events, since it is possible for the atomic commit to complete for only a subset of the slot numbers prior to the power failure, in one embodiment a list of all slot numbers associated with an atomic operation (a host command or an atomic group of host command(s)) is stored in a power-safe manner (e.g., in non-volatile memory). This is shown in the example list 1704 in FIG. 17, where slot nos. 1, 5, and 6 are saved as being associated with the same atomic operation. This provides a mechanism, during the reconstruction process, to discover the set of atomic slot numbers associated with the atomic operation and, in turn, the atomic commit status of each slot number, which enables the controller to complete the remaining atomic commits. Using the example of FIG. 17, if the commit finished in slot nos. 1 and 5 but not in slot no. 6 before the power cycle, such a condition would be detected at reconstruction by virtue of the saved list 1704, and the atomic commit at slot no. 6 can be performed at that time. Alternatively, each commit operation at a particular slot could be protected from power interruption by reserved power (e.g., capacitors) so that the overall commit scheme can be power safe.
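
Completing any commits left unfinished by a power cycle can be sketched as follows, where is_slot_committed and commit_slot are hypothetical helpers standing in for the controller's slot bookkeeping.

    def finish_pending_commits(saved_slot_list, is_slot_committed, commit_slot) -> None:
        # saved_slot_list is the power-safe list of slots for one atomic
        # operation (e.g., slots 1, 5 and 6 in the example of FIG. 17).
        for slot in saved_slot_list:
            if not is_slot_committed(slot):
                commit_slot(slot)           # perform the remaining atomic commit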

FIG. 18 is a flow diagram showing the handling of atomic command(s) that may write scattered information, according to one embodiment. The actions shown may be performed by firmware executed on the controller, hardware automation, or a combination of both. At 1801, one or more host atomic command(s) writing scattered information are received. A host may indicate that a command is atomic by a bit flag or other similar mechanisms. In one embodiment, a group of commands may be indicated as atomic by virtue of: (1) a field in each command (e.g., grouped atomic) indicating whether the command is part of a grouped atomic operation, and (2) another field in each command noting an atomic group identifier, indicating to which atomic group the command is assigned.

At 1802, the LBA ranges affected by the received command(s) are obtained. This could be done, for example, by obtaining the LBA ranges from the host, or by extracting them from the command(s). At 1803, an atomic slot (with a unique sequence number) is assigned to each contiguous LBA range. Optionally, at 1804, in one embodiment, the assignments (e.g., the list 1704) are saved in a power-safe manner, so that if a power cycle occurs before all commits are performed, the unfinished commits can be completed after the power cycle, as described above. At 1805, the ranges of LBAs are written in a manner similar to that described above with respect to the generic, single atomic write command case, and the atomic commits are performed when the writes are completed. For example, in one embodiment, the atomic commits at the slots are delayed until the last write is completed, at which point all the commits are executed to ensure atomicity. The delay of the individual commits until all writes are complete ensures that, if a power cycle were to occur, the system could revert to the state before any portion of the atomic operation was started, since each slot would indicate an in-progress atomic write. Reconstruction can then revert to the prior state by processing each range of LBAs (i.e., each slot) as in the single atomic command case, as previously described above. At 1806, a completion acknowledgement is sent to the host once all the commits are completed at the slots.

CONCLUSION

While certain embodiments of the disclosure have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the disclosure. Indeed, the novel methods, devices and systems described herein may be embodied in a variety of other forms. Furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the disclosure. For example, those skilled in the art will appreciate that in various embodiments, the actual physical and logical structures may differ from those shown in the figures. Depending on the embodiment, certain steps described in the examples above may be removed, and others may be added. Also, the features and attributes of the specific embodiments disclosed above may be combined in different ways to form additional embodiments, all of which fall within the scope of the present disclosure. Although the present disclosure provides certain preferred embodiments and applications, other embodiments that are apparent to those of ordinary skill in the art, including embodiments which do not provide all of the features and advantages set forth herein, are also within the scope of this disclosure.

What is claimed is:
 1. A method of performing an operation in a data storage device, the method comprising: receiving a plurality of atomic write commands; grouping together the plurality of atomic write commands into a single atomic operation; obtaining a plurality of ranges of logical addresses affected by the plurality of atomic write commands; for each of the plurality of affected ranges, assigning metadata information to track completion of a write operation performed at that range; performing the write operations in the ranges of logical addresses; updating the metadata information upon completion of the write operations in the ranges; and deferring an update to a translation map of the data storage device until the metadata information has been updated.
 2. The method of claim 1, wherein the grouping comprises: assigning metadata to each of the plurality of atomic write commands indicating that each of the plurality of atomic write commands is part of a grouped atomic operation; and assigning metadata to each of the plurality of atomic write commands indicating an atomic group identifier.
 3. The method of claim 1, wherein each range in the plurality of ranges of logical addresses affected is non-contiguous.
 4. The method of claim 1, wherein the metadata for each range is unique to the write operation performed at the associated range of logical addresses.
 5. The method of claim 4, wherein the metadata for each write operation associated with the plurality of atomic write commands comprises: a sequence number of a plurality of sequence numbers assigned to each of the plurality of ranges; and a slot number configured to store one of the sequence numbers.
 6. The method of claim 5, wherein updating the metadata information upon completion of the write operations in the ranges comprises: changing, in the power-safe storage, the sequence numbers associated with the plurality of atomic write commands after all the write operations are completed.
 7. The method of claim 5, wherein a list of all the slot numbers associated with the plurality of ranges and a status of the write operation performed at the associated range of logical addresses are stored in a power-safe storage.
 8. The method of claim 6, wherein the changing comprises, when a write operation performed at one of the ranges completes, changing the sequence number associated with that range.
 9. The method of claim 8, further comprising: after all the data specified by the plurality of atomic write commands have been written and the metadata updated, updating the translation map; and acknowledging a completion of the plurality of atomic write commands to a host coupled to the data storage device after the translation map has been updated.
 10. A controller in a data storage device, the controller comprising: a processor configured to: receive a plurality of atomic write commands; group together the plurality of atomic write commands into a single atomic operation; obtain a plurality of ranges of logical addresses affected by the plurality of atomic write commands; for each of the plurality of affected ranges, assign metadata information to track completion of a write operation performed at that range; perform the write operations in the ranges of logical addresses; update the metadata information upon completion of the write operations in the ranges; and defer an update to a translation map of the data storage device until the metadata information has been updated.
 11. The controller of claim 10, wherein the processor is configured to group together the plurality of atomic write commands by: assigning metadata to each of the plurality of atomic write commands indicating that each of the plurality of atomic write commands is part of a grouped atomic operation; and assigning metadata to each of the plurality of atomic write commands indicating an atomic group identifier.
 12. The controller of claim 10, wherein each range in the plurality of ranges of logical addresses affected is non-contiguous.
 13. The controller of claim 10, wherein the metadata for each range is unique to the write operation performed at the associated range of logical addresses.
 14. The controller of claim 13, wherein the metadata for each write operation associated with the plurality of atomic write commands comprises: a sequence number of a plurality of sequence numbers assigned to each of the plurality of ranges; and a slot number configured to store one of the sequence numbers.
 15. The controller of claim 14, wherein updating the metadata information upon completion of the write operations in the ranges comprises: changing, in the power-safe storage, the sequence numbers associated with the plurality of atomic write commands after all the write operations are completed.
 16. The controller of claim 14, wherein a list of all the slot numbers associated with the plurality of ranges and a status of the write operation performed at the associated range of logical addresses are stored in a power-safe storage.
 17. The controller of claim 15, wherein the changing comprises, when a write operation performed at one of the ranges completes, changing the sequence number associated with that range.
 18. The controller of claim 10, configured to: after all the data specified by the plurality of atomic write commands have been written and the metadata updated, update the translation map; and acknowledge a completion of the plurality of atomic write commands to a host coupled to the data storage device after the translation map has been updated.
 19. A data storage device comprising: a plurality of non-volatile solid-state memory devices; and a controller comprising: a processor configured to: receive a plurality of atomic write commands; group together the plurality of atomic write commands into a single atomic operation; obtain a plurality of ranges of logical addresses affected by the plurality of atomic write commands; for each of the plurality of affected ranges, assign metadata information to track completion of a write operation performed at that range; perform the write operations in the ranges of logical addresses; update the metadata information upon completion of the write operations in the ranges; and defer an update to a translation map of the data storage device until the metadata information has been updated.
 20. The data storage device of claim 19, wherein the processor is configured to group together the plurality of atomic write commands by: assigning metadata to each of the plurality of atomic write commands indicating that each of the plurality of atomic write commands is part of a grouped atomic operation; and assigning metadata to each of the plurality of atomic write commands indicating an atomic group identifier.